A category of variables consists of:
From the second and third axioms, it follows by mathematical induction that contains at least one set of each finite cardinality; thus the inclusion functor is an equivalence of categories.
The intuition is that each object of is a possible “set of free variables” that may occur in a term. In a moment we will define the set of “terms with free variables in ” for any . The fresh extensions are the allowable ways to “add a new free variable”, such as when extending a context or passing under a binder. In particular, this means that in practice the only objects of that will be relevant are those obtainable from by iterated fresh extension.
Fresh pushouts tell us how to transform “new free variables” under structural rules such as weakening, exchange, and contraction applied to the previous free variables. Similarly, fresh pushout complements tell us how to transform “new free variables” under substitution for a previously free variable. This will become clearer later. Note that just as has all pushouts, it also has all pushout complements of (complemented) injections; thus the content of the fresh pushouts and fresh pushout complements is not that they exist, but that they are specified and that the morphism across from the given fresh extension is also a fresh extension.
First we have the tautological example.
Let , let every injection with cocardinality 1 be a fresh extension, and choose pushouts and pushout complements arbitrarily.
Now a couple of examples using “named variables”. For both of them, let be any infinite set (the “variable names”) equipped with an operation such that and , i.e. a way to “freshen” a given variable name with respect to a given finite set of existing variables.
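As a concrete sketch, such a freshening operation might look as follows in Python. The choice of strings as names and priming as the renaming scheme is ours, not forced by the definition; any operation producing a name outside the given finite set would do.

```python
def freshen(name, avoid):
    """Freshen `name` with respect to the finite set `avoid` of
    existing variable names: return a variant of `name` that is
    guaranteed not to lie in `avoid`.  Here we simply append primes
    until the name is unused (one of many possible choices)."""
    while name in avoid:
        name = name + "'"
    return name
```
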
Let consist of all finite subsets of . The fresh extensions are the literal subset inclusions (not mere injections) with cocardinality 1.
This category of variables roughly represents the use of named variables with a “local Barendregt convention” that the name of any bound variable differs from all other free variables in scope at the point of binding (though distinct bound variables with non-overlapping scopes might still have the same name). The issue in this approach is to ensure that the convention is maintained under weakening: adding a new free variable might force us to rename a bound variable to avoid conflict.
In our abstract setting, this renaming is accomplished by the definition of fresh pushout. Given a fresh extension with fresh name and a function , if then the fresh pushout has vertex , while if then it has vertex . The definition of fresh pushout complement is similar.
Let consist again of all finite subsets of , but now let the fresh extensions of consist of the subset inclusions with cocardinality 1 together with the maps of the form
for any .
This category of variables roughly represents a different way of using named variables: a bound variable is allowed to have the same name as a variable already in the context, but instead of literally “shadowing” the latter, the latter is given a modified name (such as or ) only in the scope of the new binder. The fresh extensions that are not subset inclusions “carry along the information” about this renaming, so that in particular when substituting inside the scope of that binder any references to the outer free variable can automatically be suitably renamed (to avoid variable capture).
In this example, given a fresh extension with fresh name and a function , if then the new fresh extension is the subset inclusion , while if then it is of the other form determined by and . The definition of fresh pushout complements is similar.
Inspecting these examples and comparing them to the general definition, we can conclude that in general a fresh extension can include information about how both a variable and a given finite set of variables must be renamed in order to make the former fresh for the latter. The same point is made by the de Bruijn examples:
Let consist of all sets , with the unique fresh extension of being (so the fresh name is always ). Since every object has a unique fresh extension, fresh pushouts and fresh pushout complements are uniquely determined. This corresponds to de Bruijn indices: newly bound variables are denoted , with all existing variable numbers incremented. The fresh extensions again carry along the information of how free variables in terms being substituted must be modified when passing under a binder.
Let consist again of all sets , but now the unique fresh extension of is the inclusion (for which the fresh name is ). Again, since every object has a unique fresh extension, fresh pushouts and fresh pushout complements are uniquely determined. This corresponds to “de Bruijn levels”.
We refer to a composable sequence of fresh extensions as an -ary fresh extension . (Note that an -ary fresh extension is by definition the list , not its composite , although we will sometimes treat as an implicit coercion.) Since pushout squares compose, we can say that every -ary fresh extension has a fresh pushout along any function which is again an -ary fresh extension, and similarly any -ary fresh extension has a fresh pushout complement relative to any injection of cocardinality 1 which is again an -ary fresh extension.
We will work at the generality of the abstract binding trees of Harper (Practical foundations of programming languages). Thus, we suppose given a set of sorts, and also a set of operators. Moreover, each operator is assigned a signature or generalized arity of the following form:
where each is a sort. The meaning is that such an operator is of sort and takes arguments, where the argument binds arguments with sorts and has sort . Thus we would write an application of it as
Put differently, if we regard the sorts as the base types (zeroth-order types) in a simply typed lambda-calculus, then each signature is a second-order type: an iterated function type whose arguments are themselves iterated function types whose arguments are base types.
For instance, to represent the raw syntax of dependent type theory (with fully annotated terms), there would be two sorts and , with operators such as and .
A sort context consists of an object and a function . We define by mutual induction sets of terms, for a sort context and a sort . The clauses are:
If and , then .
Let be an operator of signature , and let be a sort context. For each , choose an -ary fresh extension , and extend to by assigning to the fresh name of each the sort . Suppose furthermore we have . Then we have a new term .
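These two clauses can be sketched as a well-sortedness checker over abstract binding trees, using the de Bruijn-index implementation so that contexts are just lists of sorts. The concrete operator names (`Pi`, `lam`, `app`) are our guesses at the dependent-type-theory examples, and the base-type operator `U` is purely hypothetical, added so the test has a closed type to work with.

```python
# A signature assigns each operator a list of (bound-variable sorts,
# argument sort) pairs and a result sort, as in the text.
DTT_SIGS = {
    "Pi":  {"args": [([], "ty"), (["tm"], "ty")], "sort": "ty"},
    "lam": {"args": [([], "ty"), (["tm"], "tm")], "sort": "tm"},
    "app": {"args": [([], "tm"), ([], "tm")], "sort": "tm"},
    "U":   {"args": [], "sort": "ty"},   # hypothetical base type
}

def check(t, ctx, sort, sigs=DTT_SIGS):
    """True iff t -- either ("var", i) or ("op", name, [args]) -- is a
    well-sorted term of `sort` in sort-context `ctx` (a list of sorts,
    innermost binder first, de Bruijn indices counted from 1)."""
    if t[0] == "var":
        return 1 <= t[1] <= len(ctx) and ctx[t[1] - 1] == sort
    name, args = t[1], t[2]
    sig = sigs[name]
    return (sig["sort"] == sort
            and len(args) == len(sig["args"])
            and all(check(a, list(bs) + ctx, s, sigs)
                    for a, (bs, s) in zip(args, sig["args"])))
```
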
Given any and , we define for any and with an action
by recursion on . (Note that the notation is technically ambiguous as the context is not specified. If is surjective, then is uniquely determined by , but in general this need not be the case.)
If with , then also , so we can set .
Given as above, for each we choose an -ary fresh pushout consisting of an -ary fresh extension and a function , and define . Here we extend the context to using the same sorts in the signature of that extend to .
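In the de Bruijn-index example this recursion is the usual renaming-with-cutoff: at a binder, the fresh pushout of a function along the unique fresh extension fixes the new index and shifts the function past it, which is exactly what the `depth` parameter implements below. A sketch on the toy lambda syntax (our representation; indices from 1):

```python
def rename(t, f, depth=0):
    """Apply a renaming f of free indices to a de Bruijn term.
    Indices at or below `depth` are bound and fixed; a free index i
    is sent to f(i - depth) + depth, the fresh pushout of f along
    `depth` many fresh extensions."""
    kind = t[0]
    if kind == "var":
        i = t[1]
        return ("var", i) if i <= depth else ("var", f(i - depth) + depth)
    if kind == "lam":
        return ("lam", rename(t[1], f, depth + 1))
    return ("app", rename(t[1], f, depth), rename(t[2], f, depth))
```
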
It is straightforward to prove by induction that for any . However, it will only be symmetric and transitive up to -equivalence.
We usually think of -equivalence as a “homogeneous” binary relation on each set . However, the inductions are much cleaner if we first define it in “heterogeneous” form, as a family of relations between and for any bijection such that . These relations are defined inductively as follows.
For any with , we have . (Note that by assumptions on , it is equivalent to say that for any with we have .)
Suppose given in and in . Note that if and are two fresh extensions and is a bijection, then there is a unique bijection such that and preserves the fresh names. Applying this repeatedly for each , we obtain a sequence of bijections starting with and ending with . If , then we assert .
I conjecture that if the inductive family is defined in homotopy type theory where is a univalent category, then this definition of -equivalence coincides with their ordinary (dependent) equality types.
It is straightforward to prove that each homogeneous relation is reflexive, and they are “congruences” for all the operators essentially by definition. Note that in general, the definition of at operators with binding involves for non-identity . However, this is not the case in the de Bruijn examples where every object has a unique fresh extension, and indeed in this case it is not hard to see that reduces to simple equality, as expected.
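This reduction of α-equivalence to equality in the de Bruijn case is what the standard “convert and compare” test exploits. A sketch, using named terms of the form `("var", x)`, `("lam", x, body)`, `("app", f, a)` (our representation) translated into the de Bruijn-index example:

```python
def to_db(t, env=()):
    """Translate a named lambda term into de Bruijn indices (counted
    from 1, innermost binder first).  Free names must appear in env."""
    kind = t[0]
    if kind == "var":
        return ("var", env.index(t[1]) + 1)
    if kind == "lam":
        return ("lam", to_db(t[2], (t[1],) + env))
    return ("app", to_db(t[1], env), to_db(t[2], env))

def alpha_eq(t1, t2):
    """Two closed named terms are alpha-equivalent iff their
    translations into the unique-fresh-extension setting are equal."""
    return to_db(t1) == to_db(t2)
```
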
We now prove successively that they are congruences for the structural rules, are functorial, and are equivalence relations.
We prove by induction on that if then .
In the case , we have .
Suppose because for all , as in the definition. Then the inverses are the comparison maps in the other direction from to , and by the inductive hypothesis we have for all , which is what we need to conclude .
We prove by induction on that if and then .
In the case and , we have .
Suppose because for all , and similarly because for all . Then by the inductive hypothesis, we have , and the are the comparison maps for . Thus we have what we need to conclude .
Suppose given a commutative square , where are bijections. We prove by induction on that if then .
In the case , we have , hence .
Suppose because for all , as in the definition. The -ary fresh pushouts of each along and each along give us -ary fresh extensions and and functions and . Now by pasting of pushouts we can uniquely fill in the sequence of bijections ending in . By the inductive hypothesis, we have for each , which by definition is what we need to ensure .
Finally, we prove by induction on that whenever , with a bijection, we have .
Given , we have .
Given for , for any we have the -ary fresh pushout of , the -ary fresh pushout of , and also the -ary fresh pushout of . By the universal properties of pushouts, the bijection extends to a unique sequence of bijections between the sequences and that commute with everything, which are also the unique comparison maps between the -ary fresh extensions and induced by as in the definition of . Thus, the inductive hypothesis ensures for all , which is what we need to conclude .
It now follows that the relations are equivalence relations and congruences, and if we quotient the sets by these equivalence relations, the actions become functorial. However, we will not actually perform this quotienting at this stage; doing so prematurely here seems to cause more trouble than it saves (see “general judgments” below).
It does also follow that the heterogeneous -equivalence is reducible to the homogeneous one: we have if and only if .
Let be an injection of cocardinality 1 with , let and , and let . We define the substitution by induction on .
If , then , which is well-typed since then .
If , then for a unique , and we define .
If , for each we have the -ary fresh pushout complement of and consisting of an -ary fresh extension and a function of cocardinality 1. The signature of extends to , which thereby restricts to which extends along . Thus, we can apply the functorial structural rules to get , while by definition we have . So the inductive hypothesis gives us , which we can then use to define to be .
Note how it is impossible for any variables to be “captured”, because the variables in get “renamed” (if necessary) by the weakenings . Recall that in Examples and the fresh extensions carry the information about how the variables in the context have to appear with different names inside a binder, while in Examples and , where the fresh extensions are subset inclusions, the bound variables are always distinct from those in the context. The categorical and strongly-typed framework frees us from having to think about which of these situations we are in and prevents us from accidentally making a wrong definition of substitution that would capture variables: itself does not have the correct variable-set and sort-context to be substituted into ; we must weaken it by first.
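In the de Bruijn-index example, this substitution is the textbook capture-avoiding one: the pushout complement shrinks the context (indices above the substituted one drop by one), and the explicit weakening before passing under a binder is the index shift. A sketch on the toy lambda syntax (our representation, indices from 1, with the weakening `shift` inlined so the snippet stands alone):

```python
def shift(t, cutoff=0):
    """Weakening: increment every free index (above `cutoff`)."""
    kind = t[0]
    if kind == "var":
        return ("var", t[1] + 1) if t[1] > cutoff else t
    if kind == "lam":
        return ("lam", shift(t[1], cutoff + 1))
    return ("app", shift(t[1], cutoff), shift(t[2], cutoff))

def subst(t, s, j=1):
    """Substitute s for the free index j in t.  Free indices above j
    drop by one (the context has one fewer variable); under a binder
    the substituend is explicitly weakened by `shift` first, so no
    variable can be captured."""
    kind = t[0]
    if kind == "var":
        i = t[1]
        if i == j:
            return s
        return ("var", i - 1) if i > j else t
    if kind == "lam":
        return ("lam", subst(t[1], shift(s), j + 1))
    return ("app", subst(t[1], s, j), subst(t[2], s, j))
```
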
Suppose given a commutative square , where are injections of cocardinality 1 and are bijections, with sort-contexts with , terms , and . We prove by induction (on ) that .
Note that if and , we must have . Thus, in the case we must have , hence and , so we are done by the assumption .
In the case of other variables , we have and , so we conclude by .
Suppose because for all . Because pushout complements (of monomorphisms in an adhesive category) are unique up to unique isomorphism, the isomorphisms induce isomorphisms . The compatibility of structural actions with -equivalence implies that , so the inductive hypothesis yields , whence by definition.
Let us specialize to the case of dependent type theory, with two sorts and . Note that is a sort, while is the set of abstract binding trees of sort ; thus the “raw terms” of DTT are and the “raw types” are . Since we do not generally hypothesize types, we consider only sort-contexts that assign the sort to every variable; we henceforth omit to mention these sort-contexts. Thus we speak of raw terms with variables and raw types with variables .
We define the raw contexts with variables , for any , inductively as follows.
There is a unique raw context with variables , the empty context .
If is a raw context with variables and is a raw type with variables , and is a fresh extension, then is a raw context with variables .
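Under the de Bruijn-index implementation, this inductive definition amounts to a simple scoping invariant: a raw context is a list of types, the i-th of which may mention only the i-1 previously introduced variables. A sketch (reusing the toy tuple syntax as a stand-in for raw types, since the full type syntax is not needed to check scoping):

```python
def max_free(t, depth=0):
    """Largest free de Bruijn index occurring in t (0 if closed)."""
    kind = t[0]
    if kind == "var":
        return t[1] - depth if t[1] > depth else 0
    if kind == "lam":
        return max_free(t[1], depth + 1)
    return max(max_free(t[1], depth), max_free(t[2], depth))

def ctx_ok(ctx):
    """ctx = [A1, ..., An]: each Ai is a raw type that may refer only
    to the i-1 variables introduced before it (each fresh extension
    having added exactly one variable)."""
    return all(max_free(A) <= i for i, A in enumerate(ctx))
```
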
Similarly, if is a bijection and are raw contexts with variables respectively, we define inductively:
If and , and is the unique bijection induced by and the fresh extensions and , then .
Note that this “variable renaming” of contexts includes heterogeneous -equivalence (renaming bound variables) of the types occurring in the context. Thus, its homogeneous version is exactly -equivalence of all the types.
The basic judgments of DTT are:
Recall that we did not actually quotient our raw syntax by -equivalence. Thus there is no problem in stating a general judgment rule such as
Here denotes the same thing in both premise and conclusion: a fresh extension of the set of variables of . We write in the premise meaning the explicit weakening (structural action) of along .
We deal with -equivalence by proving afterwards by induction that all our valid judgments are invariant under context renaming / heterogeneous -equivalence. For instance, the part of this result referring to the term rule is that whenever , and , if then also . And the corresponding inductive clause for the above -abstraction rule is:
Perhaps one could formulate a general notion of “judgment” and “rule” that would allow proving this sort of lemma once and for all, rather than piece by piece about every rule in a particular theory, analogous to how abstract binding trees can deal simultaneously with all the binding constructions of all theories (, , , , etc.).
I expect it should be possible to prove that the sets and all their structure are independent, in a suitable sense, of the choice of . Probably the easiest way to do this would be to translate from an arbitrary to the tautological example and show that this translation is an isomorphism (up to -equivalence).
Our term syntax, for a general , is very similar to the way terms with binding are usually written. Of course there is the usual syntactic desugaring of to and so on. Also, in some cases one traditionally binds a single variable (or, in our case, a fresh extension) in two or more subterms (such as a fully-annotated abstraction where is bound in both and ), whereas we have to desugar this to copy the variable (e.g. ). These trivialities are common to other approaches to abstract binding as well.
More significantly, consider the three places that the same notion of “variable name” is used in traditional named syntax:
In de Bruijn syntax, natural numbers are used in cases (1) and (3), while at (2) there is no “variable name” used at all. In our general framework, all three of these situations take different inputs: for (1) we give an element of a set , for (2) we give a fresh extension of the ambient set of free variables, and for (3) we give an injection of cocardinality 1 into the current set of free variables.
We could coalesce (3) to (1) by equipping our s with a further operation: given and , a specified injection whose complement is . In the named-variable Examples and would be the subset inclusion of , while in the de Bruijn Examples and it would be the ordered injection that omits . This would probably work just as well; there seems no reason that substitution into one variable needs to be allowed to rename the other variables.
We could also make (1) and (2) appear to coalesce by declaring “the fresh name of a fresh extension” to be an implicit coercion. Then we could write a term like where the first occurrence of denotes a fresh extension, while the latter two automatically coerce it to its fresh name.
However, there seems no way around the fact that our syntax requires all variables in the context to be explicitly (potentially) renamed every time they pass under a binder. Thus even with this coercion convention we cannot write , since the fresh extension might rename the fresh name of the fresh extension , and inside the binder we can only use elements of as terms. We have to instead write something like . This is still somewhat easier for a human to read than de Bruijn syntax, since the binders are referred to by name rather than by number, and it’s probably not unreasonable when doing formal theory (e.g. the explicit weakening in the abstraction rule above) — for the Initiality Project I’m tempted to use this formalism explicitly, as the variable sets correspond fairly closely to the semantic interpretations I have in mind.
However, for actual practice in a type theory this syntax is more verbose than one would want, so one would probably choose a particular “implementation” — e.g. the Barendregt convention (Example ) for human readers, or de Bruijn indices (Example ) for a computer. The expected uniqueness theorem would enable translation between them as needed.
Last revised on November 7, 2018 at 20:03:59.